Handling Conjunctions in Named Entities
Abstract
Named entity recognition consists of identifying ‘mentions’ (strings in a text that correspond to named entities) and then classifying each mention as a specific type of named entity, with typical categories being Company, Person and Location. The full range of named entity categories to be identified is usually application dependent. First introduced as a separately evaluated task at the Sixth Message Understanding Conference in 1995 (see, for example, [Grishman and Sundheim, 1995; 1996]), named entity recognition has attracted a considerable amount of research effort. Initially handled with hand-crafted rules (as, for example, in many of the participating systems in MUC-6 and MUC-7) and later by means of statistical approaches (see, for example, [Sang, 2002; Sang and Meulder, 2003]), the task is now addressed by state-of-the-art systems that deliver high performance in named entity identification and classification, both for specific domains and in language- and domain-independent settings. However, our experience with existing software tells us that some categories of named entities remain problematic. In particular, very little work has explored the potential ambiguity of conjunctions appearing in named entity strings. Consider, for example, a string like the following:
Similar resources
Disambiguating Conjunctions in Named Entities
The recognition of named entities is now a well-developed area, with a range of symbolic and machine learning techniques that deliver high accuracy identification and categorisation of a variety of entity types. However, there are still some named entity phenomena that present problems for existing techniques; in particular, relatively little work has explored the disambiguation of conjunctions ...
Named Entity Extraction with Conjunction Disambiguation
The recognition of named entities is now a well-developed area, with a range of symbolic and machine learning techniques that deliver high accuracy extraction and categorisation of a variety of entity types. However, there are still some named entity phenomena that present problems for existing techniques; in particular, relatively little work has explored the disambiguation of conjunctions app...
PAYMA: A Tagged Corpus of Persian Named Entities
The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...
Recognizing Names by the Meaning of Defining Syntactic Structures
We present a named entity recognition system that relies on interpreting the meaning of the syntactic structures that define named entities. The syntactic heads of the defining structures are learned from the training data and are later used to interpret the meaning of the syntactic structures in the test data. Semantic features are extracted based on the interpretations and are used to build a...
A Supervised Machine Learning Approach to Conjunction Disambiguation in Named Entities
Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish...